-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
handle ENOBUFS when writing to VM socket #370
Conversation
When a high throughput is happeing we send a lot of data into the VM socket, the VM has to read it in order to empty the buffer. This is inherently race so it is possible to get ENOBUFS here. In this case just keep trying writing until it works. Fixes containers#367 Signed-off-by: Paul Holzinger <[email protected]>
The test failure doesn't seem related, looks like it relies on a stable TXT entry for wikipedia.org which isn't the case. It should use a domain under our control that is unlikely to change (podman.io or crc.dev maybe?) or the test must setup a local dns server which returns a stable known TXT record. |
@cfergeau PTAL |
Yeah, I hit the test failure while looking at the PR, and came to the same conclusion, especially as there was already a change last month ( 79001db ). I'll add a record to crc.dev |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix!
test/basic_test.go
Outdated
@@ -77,7 +77,7 @@ var _ = ginkgo.Describe("dns", func() { | |||
ginkgo.It("should resolve TXT for wikipedia.org", func() { | |||
out, err := sshExec("nslookup -query=txt wikipedia.org") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you change wikipedia.org
to crc.dev
? It already returns what's expected by this test:
$ nslookup -query=txt crc.dev
[...]
crc.dev text = "v=spf1 include:_mailcust.gandi.net ?all"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed, also added text =
part to the match just be be sure it is TXT
// again until it works or we get a different error | ||
// https://github.com/containers/gvisor-tap-vsock/issues/367 | ||
continue | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering if we should use conn.SetWriteDeadline when there's ENOBUFS
to be sure we're not getting into an infinite loop.
But looks good to me as is, we can refine it further if there are issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that is an option I guess, my basic understanding is that could only happen if the reader side (VM process) no longer reads anything from the socket but still keeps the socket open.
Sounds rather unlikely to me but yes if this turn out to cause problems we can add a deadline later
It sems the TXT entry for wikipedia.org changed which is causing the test to fail. Fix it by only checking for a partial match and use a domain that is under our control (crc.dev). Signed-off-by: Paul Holzinger <[email protected]>
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: cfergeau, Luap99 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/hold |
/unhold |
a3a9c41
into
containers:main
Thanks @cfergeau Can you create a new release sometime this week? We release podman 5.2 next week and it would be good to get a new gvproxy version into the installer to fix this for podman users. |
Yup, for sure, I'll try to do this today or tomorrow. |
@Luap99 I've made the release and created containers/podman#23387 |
We used external package (tcpproxy) for proxying between unix stream and datagram sockets. This package cannot handle ENOBUFS error, expected condition on BSD based systems, and worse, it hides errors and stop forwarding packets silently when write to vz socket fails with ENOBUFS[1]. Fix the issues by replacing tcpproxy with a simpler and more direct implementation that will be easier to maintain. Fixes: - Fix error handling if write to vz datagram socket fail with ENOBUFS. We retry the write until it succeeds. Same solution is used in gvisor-tap-vsock[2]. - Fix error handling if we could not read packet header or body from socket_vmnet stream socket. Previously we logged an error and continue, sending corrupted packet to VZ. - Fix error handling if writing a packet to socket_vmnet stream socket returned after writing partial packet. Now we handle short writes and write the complete packet. Previously this ended with 2 corrupted packets. - Log error if forwarding packets from vz to socket_vmnet or from socket_vmnet to vz failed. Simplification: - Use binary.Read() and binary.Write() to read and write qemu packet header. [1] lima-vm/socket_vmnet#39 [2] containers/gvisor-tap-vsock#370 Signed-off-by: Nir Soffer <[email protected]>
We used external package (tcpproxy) for proxying between unix stream and datagram sockets. This package cannot handle ENOBUFS error, expected condition on BSD based systems, and worse, it hides errors and stop forwarding packets silently when write to vz socket fails with ENOBUFS[1]. Fix the issues by replacing tcpproxy with a simpler and more direct implementation that will be easier to maintain. Fixes: - Fix error handling if write to vz datagram socket fail with ENOBUFS. We retry the write until it succeeds. Same solution is used in gvisor-tap-vsock[2]. - Fix error handling if we could not read packet header or body from socket_vmnet stream socket. Previously we logged an error and continue, sending corrupted packet to VZ. - Fix error handling if writing a packet to socket_vmnet stream socket returned after writing partial packet. Now we handle short writes and write the complete packet. Previously this ended with 2 corrupted packets. - Log error if forwarding packets from vz to socket_vmnet or from socket_vmnet to vz failed. Simplification: - Use binary.Read() and binary.Write() to read and write qemu packet header. [1] lima-vm/socket_vmnet#39 [2] containers/gvisor-tap-vsock#370 Signed-off-by: Nir Soffer <[email protected]>
We used external package (tcpproxy) for proxying between unix stream and datagram sockets. This package cannot handle ENOBUFS error, expected condition on BSD based systems, and worse, it hides errors and stop forwarding packets silently when write to vz socket fails with ENOBUFS[1]. Fix the issues by replacing tcpproxy with a simpler and more direct implementation that will be easier to maintain. Fixes: - Fix error handling if write to vz datagram socket fail with ENOBUFS. We retry the write until it succeeds. Same solution is used in gvisor-tap-vsock[2]. - Fix error handling if we could not read packet header or body from socket_vmnet stream socket. Previously we logged an error and continue, sending corrupted packet to VZ. - Fix error handling if writing a packet to socket_vmnet stream socket returned after writing partial packet. Now we handle short writes and write the complete packet. Previously this ended with 2 corrupted packets. - Log error if forwarding packets from vz to socket_vmnet or from socket_vmnet to vz failed. Simplification: - Use binary.Read() and binary.Write() to read and write qemu packet header. Visibility: - Make QEMUPacketConn private since it is an implementation detail correct only for lima vz-socket_vmnet use case. [1] lima-vm/socket_vmnet#39 [2] containers/gvisor-tap-vsock#370 Signed-off-by: Nir Soffer <[email protected]>
We used external package (tcpproxy) for proxying between unix stream and datagram sockets. This package cannot handle ENOBUFS error, expected condition on BSD based systems, and worse, it hides errors and stop forwarding packets silently when write to vz socket fails with ENOBUFS[1]. Fix the issues by replacing tcpproxy with a simpler and more direct implementation that will be easier to maintain. Fixes: - Fix error handling if write to vz datagram socket fail with ENOBUFS. We retry the write until it succeeds with a very short sleep between retries.. Similar solution is used in gvisor-tap-vsock[2]. - Fix error handling if we could not read packet header or body from socket_vmnet stream socket. Previously we logged an error and continue, sending corrupted packet to VZ. - Fix error handling if writing a packet to socket_vmnet stream socket returned after writing partial packet. Now we handle short writes and write the complete packet. Previously this ended with 2 corrupted packets. - Log error if forwarding packets from vz to socket_vmnet or from socket_vmnet to vz failed. Simplification: - Use binary.Read() and binary.Write() to read and write qemu packet header. Visibility: - Make QEMUPacketConn private since it is an implementation detail correct only for lima vz-socket_vmnet use case. [1] lima-vm/socket_vmnet#39 [2] containers/gvisor-tap-vsock#370 Signed-off-by: Nir Soffer <[email protected]>
We used external package (tcpproxy) for proxying between unix stream and datagram sockets. This package cannot handle ENOBUFS error, expected condition on BSD based systems, and worse, it hides errors and stop forwarding packets silently when write to vz socket fails with ENOBUFS[1]. Fix the issues by replacing tcpproxy with a simpler and more direct implementation that will be easier to maintain. Fixes: - Fix error handling if write to vz datagram socket fail with ENOBUFS. We retry the write until it succeeds with a very short sleep between retries.. Similar solution is used in gvisor-tap-vsock[2]. - Fix error handling if we could not read packet header or body from socket_vmnet stream socket. Previously we logged an error and continue, sending corrupted packet to VZ. - Fix error handling if writing a packet to socket_vmnet stream socket returned after writing partial packet. Now we handle short writes and write the complete packet. Previously this ended with 2 corrupted packets. - Log error if forwarding packets from vz to socket_vmnet or from socket_vmnet to vz failed. Simplification: - Use binary.Read() and binary.Write() to read and write qemu packet header. Visibility: - Make QEMUPacketConn private since it is an implementation detail correct only for lima vz-socket_vmnet use case. [1] lima-vm/socket_vmnet#39 [2] containers/gvisor-tap-vsock#370 Signed-off-by: Nir Soffer <[email protected]>
We used external package (tcpproxy) for proxying between unix stream and datagram sockets. This package cannot handle ENOBUFS error, expected condition on BSD based systems, and worse, it hides errors and stop forwarding packets silently when write to vz socket fails with ENOBUFS[1]. Fix the issues by replacing tcpproxy with a simpler and more direct implementation that will be easier to maintain. Fixes: - Fix error handling if write to vz datagram socket fail with ENOBUFS. We retry the write until it succeeds with a very short sleep between retries. Similar solution is used in gvisor-tap-vsock[2]. - Fix error handling if we could not read packet header or body from socket_vmnet stream socket. Previously we logged an error and continue to send corrupted packet to vz from the point of the failure. - Fix error handling if writing a packet to socket_vmnet stream socket returned after writing partial packet. Now we handle short writes and write the complete packet. Previously would break the protocol and continue to send corrupted packet from the point of the failure. - Log error if forwarding packets from vz to socket_vmnet or from socket_vmnet to vz failed. Simplification: - Use binary.Read() and binary.Write() to read and write qemu packet header. Visibility: - Make QEMUPacketConn private since it is an implementation detail of vz when using socket_vmnet. [1] lima-vm/socket_vmnet#39 [2] containers/gvisor-tap-vsock#370 Signed-off-by: Nir Soffer <[email protected]>
We used external package (tcpproxy) for proxying between unix stream and datagram sockets. This package cannot handle ENOBUFS error, expected condition on BSD based systems, and worse, it hides errors and stop forwarding packets silently when write to vz socket fails with ENOBUFS[1]. Fix the issues by replacing tcpproxy with a simpler and more direct implementation that will be easier to maintain. Fixes: - Fix error handling if write to vz datagram socket fail with ENOBUFS. We retry the write until it succeeds with a very short sleep between retries. Similar solution is used in gvisor-tap-vsock[2]. - Fix error handling if we could not read packet header or body from socket_vmnet stream socket. Previously we logged an error and continue to send corrupted packet to vz from the point of the failure. - Fix error handling if writing a packet to socket_vmnet stream socket returned after writing partial packet. Now we handle short writes and write the complete packet. Previously would break the protocol and continue to send corrupted packet from the point of the failure. - Log error if forwarding packets from vz to socket_vmnet or from socket_vmnet to vz failed. Simplification: - Use binary.Read() and binary.Write() to read and write qemu packet header. Visibility: - Make QEMUPacketConn private since it is an implementation detail of vz when using socket_vmnet. [1] lima-vm/socket_vmnet#39 [2] containers/gvisor-tap-vsock#370 Signed-off-by: Nir Soffer <[email protected]>
We used external package (tcpproxy) for proxying between unix stream and datagram sockets. This package cannot handle ENOBUFS error, expected condition on BSD based systems, and worse, it hides errors and stop forwarding packets silently when write to vz socket fails with ENOBUFS[1]. Fix the issues by replacing tcpproxy with a simpler and more direct implementation that will be easier to maintain. Fixes: - Fix error handling if write to vz datagram socket fail with ENOBUFS. We retry the write until it succeeds with a very short sleep between retries. Similar solution is used in gvisor-tap-vsock[2]. - Fix error handling if we could not read packet header or body from socket_vmnet stream socket. Previously we logged an error and continue to send corrupted packet to vz from the point of the failure. - Fix error handling if writing a packet to socket_vmnet stream socket returned after writing partial packet. Now we handle short writes and write the complete packet. Previously would break the protocol and continue to send corrupted packet from the point of the failure. - Log error if forwarding packets from vz to socket_vmnet or from socket_vmnet to vz failed. Simplification: - Use binary.Read() and binary.Write() to read and write qemu packet header. Visibility: - Make QEMUPacketConn private since it is an implementation detail of vz when using socket_vmnet. [1] lima-vm/socket_vmnet#39 [2] containers/gvisor-tap-vsock#370 Signed-off-by: Nir Soffer <[email protected]>
We used external package (tcpproxy) for proxying between unix stream and datagram sockets. This package cannot handle ENOBUFS error, expected condition on BSD based systems, and worse, it hides errors and stop forwarding packets silently when write to vz socket fails with ENOBUFS[1]. Fix the issues by replacing tcpproxy with a simpler and more direct implementation that will be easier to maintain. Fixes: - Fix error handling if write to vz datagram socket fail with ENOBUFS. We retry the write until it succeeds with a very short sleep between retries. Similar solution is used in gvisor-tap-vsock[2]. - Fix error handling if we could not read packet header or body from socket_vmnet stream socket. Previously we logged an error and continue to send corrupted packet to vz from the point of the failure. - Fix error handling if writing a packet to socket_vmnet stream socket returned after writing partial packet. Now we handle short writes and write the complete packet. Previously would break the protocol and continue to send corrupted packet from the point of the failure. - Log error if forwarding packets from vz to socket_vmnet or from socket_vmnet to vz failed. Simplification: - Use binary.Read() and binary.Write() to read and write qemu packet header. Visibility: - Make QEMUPacketConn private since it is an implementation detail of vz when using socket_vmnet. Testing: - Add a packet forwarding test covering the happy path in 10 milliseconds. [1] lima-vm/socket_vmnet#39 [2] containers/gvisor-tap-vsock#370 Signed-off-by: Nir Soffer <[email protected]>
When a high throughput is happeing we send a lot of data into the VM socket, the VM has to read it in order to empty the buffer. This is inherently race so it is possible to get ENOBUFS here. In this case just keep trying writing until it works.
Fixes #367